-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Qurro to support pandas v1 and up, and thus newer versions of QIIME 2 #322
Conversation
Maybe we can make another GitHub Actions for these later; but Songbird is causing tensorflow nonsense to pop up, and this is not the sort of thing I think we should spend time fixing (esp with the advent of birdman)
See biocore#258 and biocore#315. not confident this is done yet (and if nothing else the rest of the code gleefully refers to "SparseDataFrame" because 2019 marcus was a schmuck), but this at least fixes a fair amount of failing tests
The problem was using .loc[] on these sparse dataframes. whoops
since apparently it's deprecated, or about to be deprecated, idk
most of these warnings were just pd.DataFrame.append() being deprecated and replaced with pd.concat()
looks like it's a tiny floating-point thing -- probably an artifact of working here on a new operating system, on a new python version, a new pandas version, a new biom version, etc. shouldn't make a noticeable difference
IIRC something about how our specific altair version works makes it incompatible with python 3.10. let's test that here -- if needed, we can update the README to disallow python versions >= 3.10. (And then we can look into removing the altair pin when absolutely needed.)
Codecov Report
@@ Coverage Diff @@
## master #322 +/- ##
==========================================
- Coverage 95.09% 85.53% -9.57%
==========================================
Files 19 19
Lines 1408 1417 +9
Branches 234 237 +3
==========================================
- Hits 1339 1212 -127
- Misses 57 193 +136
Partials 12 12
Continue to review full report at Codecov.
|
Honestly I think we should just not run the notebooks and keep them static. Dependencies issues are going to accumulate and I don't think we should be expected to maintain them continually specifically for notebooks. Maybe just add a note that dependencies may have changed since creation. (Willing to change my mind on this, though.) |
Songbird and ALDEx2 ones will cause problems
since i thiiiink these versions mighta worked with the pandas >= 1 syntax that being said, i don't think it makes sense to devote time/energy to officially supporting them; just wanna check
That's fair -- this is definitely the more sustainable way to handle this :) That being said, I figure things are pretty stable for the time being: after looking into things, the bulk of the problems wound up coming from Songbird now being incompatible with Qurro's environments. In my opinion, supporting the combo of Songbird + Qurro is important, so I took some time and updated the "Red Sea" notebook to explain how to run Songbird and Qurro in separate conda environments. That fixed most of the problems (the other main thing was updating the ALDEx2 notebook, which wasn't too bad); the notebooks are now all up-to-date (and, as a nice bonus, now the users don't see all of those pandas warnings about This PR should now be ready for review / merging. |
Looks like the tests themselves pass for these versions, but the style-checking with black fails due to incompatibility with click. yeah this is enough for me to not bother supporting these versions imo
the apocalypse came and all i got was this pull request
about about about about aboot
Port CI from Travis to GitHub Actions (close Switch to GitHub actions for CI #316)
matrix
functionality in GitHub Actions to automatically test on multiple QIIME 2 versions.Updated the code to not use
pd.SparseDataFrame
, and instead usepd.DataFrame
with sparse arrays (close Support pandas v1: Switch from SparseDataFrame to "regular ... DataFrame with sparse values" #258, close Qurro will break on newer (>= 2020.11) versions of QIIME 2 #315)biom.Table.to_dataframe()
does all the heavy lifting. Previously, Qurro was actively avoiding using that function due to a bug with it ([python]to_dataframe
does not produce sparse data frames biom-format#808), but -- now that that bug is fixed in all biom versions that Qurro accepts -- we can just delegate to it.Updated the code to remove various sources of warnings. A lot of things have become deprecated in the past few years (e.g.
pd.DataFrame.append()
, using aset
as an indexer in pandas,np.matrix
, ...)Updated the documentation regarding the minimum Q2 version stuff
This PR is almost ready to go -- the main things I need to fix are